Circulating cell-free and extracellular vesicles-derived microRNA as prognostic biomarkers in patients with early-stage NSCLC: results from RESTING study

Background Factors to accurately stratify patients with early-stage non-small cell lung cancer (NSCLC) in different prognostic groups are still needed. This study aims to investigate 1) the prognostic potential of circulating cell-free (CF) and extracellular vesicles (EVs)-derived microRNA (miRNAs), and 2) their added value with respect to known prognostic factors (PFs). Methods The RESTING study is a multicentre prospective observational cohort study on resected stage IA-IIIA patients with NSCLC. The primary end-point was disease-free survival (DFS), and the main analyses were carried out separately for CF- and EV-miRNAs. CF- and EV-miRNAs were isolated from plasma, and miRNA-specific libraries were prepared and sequenced. To reach the study aims, three statistical models were specified: one using the miRNA data only (Model 1); one using both miRNAs and known PFs (age, gender, and pathological stage) (Model 2), and one using the PFs alone (Model 3). Five-fold cross-validation (CV) was used to assess the predictive performance of each. Standard Cox regression and elastic net regularized Cox regression were used. Results A total of 222 patients were enrolled. The median follow-up time was 26.3 (95% CI 25.4–27.6) months. From Model 1, three CF-miRNAs and 21 EV-miRNAs were associated with DFS. In Model 2, two CF-miRNAs (miR-29c-3p and miR-877-3p) and five EV-miRNAs (miR-181a-2-3p, miR-182-5p, miR-192-5p, miR-532-3p and miR-589-5p) remained associated with DFS. From pathway enrichment analysis, TGF-beta and NOTCH were the most involved pathways. Conclusion This study identified promising prognostic CF- and EV-miRNAs that could be used as a non-invasive, cost-effective tool to aid clinical decision-making. However, further evaluation of the obtained miRNAs in an external cohort of patients is warranted. Supplementary Information The online version contains supplementary material available at 10.1186/s13046-024-03156-y.


Introduction
Lung cancer is one of the most common cancers and has the highest mortality worldwide [1]. Significant progress has been made in metastatic non-small cell lung cancer (NSCLC) thanks to the introduction of targeted therapy and immunotherapy into clinical practice. Of note, these therapies have since been incorporated into the treatment of earlier stages.
Early-stage non-small cell lung cancer (ES-NSCLC) represents only 20-30% of all NSCLC and is characterized by a high survival probability after surgery. However, considering stage IA-IIIA NSCLC, an overall relapse rate of about 50% is observed, with wide variations within the same tumor node metastasis (TNM) stage. Most recurrences happen within the first two years after the primary tumor resection, usually at distant sites, and are associated with a dismal 5-year survival of 30% [2]. Efforts are being made to identify molecular tumor characteristics and tumor microenvironment (TME) features that discriminate patients at higher risk of relapse [3][4][5]. Recurrent stage I lung adenocarcinomas exhibited a higher mutation load and a lower methylation profile with respect to non-recurrent tumors, as well as widespread activation of known cancer and cell cycle pathways. Moreover, recurrent tumors displayed downregulation of immune response pathways, including antigen presentation and Th1/Th2 activation [6]. With regard to circulating biomarkers, circulating tumor DNA levels evaluated post-surgery or longitudinally during patient follow-up have been associated with the risk of relapse and prognosis [7], representing a promising biomarker for the evaluation of minimal residual disease. Circulating microRNAs (miRNAs) have also been studied as potential biomarkers, as they are very stable in the bloodstream. Several studies have demonstrated that different miRNA profiles are associated with prognosis [5,8]. Nonetheless, very heterogeneous results emerged from the different studies, which could be mainly due to several factors: the use of serum or plasma, the evaluation of free miRNAs or of those encapsulated in extracellular vesicles (EVs), the methodologies used for miRNA normalization, and the statistical approach applied. Moreover, the added value of the different biomarkers over already recognized clinical parameters (such as pathological stage and performance status) has rarely been evaluated.
The main objectives of this study were: i) to investigate the prognostic potential of circulating cell-free (CF) and extracellular-vesicle (EV)-derived miRNAs in a cohort of surgically treated ES-NSCLC patients, ii) to preliminarily assess their predictive accuracy, and iii) to investigate their added value as compared to basic prognostic factors.
Adult patients (≥ 18 years of age) diagnosed with a histologically confirmed stage IA-IIIA NSCLC based on the 8th edition TNM classification for lung and pleural tumors [9] and surgically resected between November 2018 and December 2020 were included in the study. Patients with concomitant cancer at the time of the ES-NSCLC diagnosis or in the previous 5 years were excluded. A pre-surgery peripheral blood sample was collected (within 3 weeks before the date of surgery). In particular, two 10 ml Cell-Free DNA BCT® Streck tubes were collected and centrifuged to obtain plasma, following specific Standard Operating Procedures (SOPs) shared by the different centers. Specifically, tubes were centrifuged at 1600 × g for 10 min at room temperature, with low acceleration and deceleration, and the upper plasma layer was transferred into a clean conical 15 ml Falcon tube and then centrifuged at 2000 × g for 10 min at room temperature, with low acceleration and deceleration. Plasma was stored at -80 °C until the start of molecular analyses. Demographic, clinical, and treatment information was collected from patients' medical records and transcribed onto the electronic case report forms (eCRFs) created using the OpenClinica system.

Isolation of extracellular vesicles and miRNA extraction
Isolation of EVs and associated RNA content was performed as previously described [10]. EVs were also characterized according to MISEV guidelines [10,11]. Briefly, for each patient, 2 ml of plasma was sequentially separated into two distinct fractions, all-size vesicular RNAs and circulating cell-free RNAs (both including miRNAs), using an all-in-one purification kit based on spin column chromatography with a silicon carbide resin separation matrix (Cat. 56900, Norgen Biotek Corp, Ontario, Canada). The extracted RNA was eluted in 50 µL and checked on RNA6000 pico chips using the Bioanalyzer 2100 instrument (Agilent Technologies, Milan, Italy). Patient EVs were then characterized for size, plasma concentration, and polydispersity using the NanoSight NS300 tracking system (Malvern Instruments Limited, Cambridge, UK). Samples were diluted in PBS to a final volume of 1 mL and to a concentration that, according to the manufacturer's software manual, fell within the particles/frame range of 20-120 and gave a total-track to valid-track ratio of less than 5 (NanoSight NS300 User Manual, MAN0541-01-EN-00, 2017). Flow cytometry characterization of 37 EV-specific surface markers was done with the MACSPlex Exosome Kit, human (Miltenyi Biotec B.V. & Co. KG, Bergisch Gladbach, Germany), as previously detailed [10]. Samples were diluted with MACSPlex buffer (MPB) to a final volume of 120 µL; 15 µL of MACSPlex Exosome Capture Beads and 5 µL of each of the MACSPlex Exosome Detection Reagent antibodies for the ubiquitous EV marker controls (CD9, CD63, CD81) were then added to each well. Following buffer incubation and washing steps, flow cytometric analysis was performed with a BD FACSVantage™ cytofluorimeter (BD Biosciences, Franklin Lakes, NJ, USA). Approximately 10,000 events were recorded per sample. Median fluorescence intensity (MFI) for all 39 capture bead subsets (37 markers plus two background controls) was background corrected (Supplementary Fig. 1). Samples were finally stored at -80 °C until miRNA library preparation. An independent 1 ml plasma aliquot was used to purify intact EVs for nanoparticle characterization.

miRNA profiling
Separate miRNA libraries were prepared for CF-miRNAs and EV-miRNAs using the QIAseq miRNA library kit (Qiagen, Milan, Italy). Libraries were prepared following the manufacturer's instructions with some adaptations for low RNA inputs, as previously described. Specifically, higher dilutions of ligation adapters and RT primers were adopted to reduce adapter-dimer formation; the number of PCR cycles was increased (22 cycles), and an additional bead-based purification step was added to remove unwanted small fragments (< 100 bp). The Qubit dsDNA HS assay kit (ThermoFisher, Waltham, MA, USA) and DNA high-sensitivity chips on the Bioanalyzer 2100 instrument (Agilent Technologies, Milan, Italy) were used for library quantification and quality checks. Libraries were then normalized and sequenced on the NextSeq 550 instrument (Illumina, San Diego, CA, USA).

Bioinformatic analysis
The bioinformatic analysis was developed in two steps, pre-alignment and alignment, followed by the identification of microRNAs. All individual steps and the related tools are shown in Supplementary Fig. 2. In the first phase, the bcl files generated by the sequencer were demultiplexed to generate fastq files. Subsequently, a quality assessment of the reads was performed to ensure the reliability of the subsequent analysis. The next step was identifying the UMIs (Unique Molecular Identifiers) inserted into the reads during library synthesis. These molecular tags allow counting the initial small RNA molecules in the starting material and reducing possible biases introduced in the PCR amplification step. As the length of mature miRNAs is known to be around 22-25 bp, reads that were too short (< 18 bp) or too long (> 30 bp) were subsequently removed. These two steps are essential for the next stage of microRNA identification and counting. We then aligned the reads against the mature miRNA sequence database, miRBase version 22, and generated the count table for subsequent analyses.
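As an illustrative sketch only (not the pipeline used in the study, whose tools are listed in Supplementary Fig. 2), the two pre-alignment ideas described above, length filtering and UMI-based counting, can be expressed in a few lines of Python. The function name `count_unique_molecules`, the toy reads, and the use of exact sequence identity in place of alignment to miRBase are assumptions made for the example.

```python
from collections import defaultdict

# Read-length bounds stated in the text: reads < 18 bp or > 30 bp are removed.
MIN_LEN, MAX_LEN = 18, 30

def count_unique_molecules(reads):
    """Count distinct UMIs per insert sequence.

    reads: iterable of (umi, insert_sequence) pairs, assumed to be already
    adapter-trimmed. Exact sequence identity stands in for the alignment
    step, which is a simplification for illustration.
    """
    umis_per_seq = defaultdict(set)
    for umi, seq in reads:
        if MIN_LEN <= len(seq) <= MAX_LEN:
            # Reads sharing both UMI and sequence are treated as PCR duplicates.
            umis_per_seq[seq].add(umi)
    return {seq: len(umis) for seq, umis in umis_per_seq.items()}

# Toy input (hypothetical reads, not study data).
toy_reads = [
    ("AAAC", "TAGCTTATCAGACTGATGTTGA"),  # 22 bp, kept
    ("AAAC", "TAGCTTATCAGACTGATGTTGA"),  # same UMI: PCR duplicate
    ("GGTT", "TAGCTTATCAGACTGATGTTGA"),  # different UMI: second molecule
    ("CCCA", "TAGCTTATC"),               # 9 bp, too short
    ("TTGA", "A" * 35),                  # 35 bp, too long
]
toy_counts = count_unique_molecules(toy_reads)
```

In this toy input, three reads of the same 22 bp sequence collapse to two molecules because two of them share a UMI, while the 9 bp and 35 bp reads are discarded.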

Statistical analysis
Data were summarized as mean ± standard deviation (SD), median, and first (IQ) and third (IIIQ) quartiles, as appropriate, for continuous variables and as counts and percentages for categorical variables. The endpoint was disease-free survival (DFS), defined as the time from the date of surgery until the date of disease relapse or death from any cause, whichever occurred first. Patients not experiencing any event were censored at the date of the most recent contact (the last follow-up update was performed on March 31st, 2022).
The Kaplan-Meier (KM) method and log-rank test were used to compare DFS curves between patient groups defined by covariates' levels. Univariable Cox regression models were used to evaluate the direction and magnitude of the association between covariates and DFS. Results were reported as hazard ratios (HRs) and median DFS, with corresponding 95% confidence intervals (CIs). Median follow-up time was computed by means of the reverse KM method. For some analyses, the categories of covariates were grouped due to their low frequency.
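To make the reverse Kaplan-Meier idea concrete, the following minimal Python sketch estimates a median follow-up time by swapping the roles of events and censoring, so that being censored becomes the "event" of interest. The function names and the toy data are assumptions for illustration; the study itself used STATA and R, and conventions for handling ties may differ between implementations.

```python
def km_median(times, events):
    """Smallest time at which the Kaplan-Meier survival estimate is <= 0.5.

    times: follow-up times; events: 1 = event observed, 0 = censored.
    Returns None when the median is not reached.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # events occurring exactly at time t
        n_leaving = 0  # subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            surv *= 1 - n_events / at_risk
            if surv <= 0.5:
                return t
        at_risk -= n_leaving
    return None

def reverse_km_median_followup(times, events):
    """Median follow-up: Kaplan-Meier with the event indicator flipped."""
    return km_median(times, [1 - e for e in events])

# Toy data in months (hypothetical, not study data); 1 = relapse or death.
toy_times = [6, 12, 18, 24, 30, 36]
toy_events = [1, 0, 1, 0, 0, 0]
toy_followup = reverse_km_median_followup(toy_times, toy_events)
toy_median_dfs = km_median(toy_times, toy_events)
```

With these toy values the median follow-up is 30 months, while the median DFS is not reached (the function returns `None`), mirroring the kind of situation reported when few events have occurred.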
Before the main statistical analysis, miRNAs were filtered starting from the raw counts matrix; miRNAs with a third quartile value of fewer than five counts were excluded. Furthermore, samples with a median or first quartile value, computed across the miRNAs, equal to zero were excluded. Once the miRNAs and samples had been filtered, data were normalized and transformed using the Trimmed Mean of M-values (TMM) method and the Blom transformation, respectively. These steps were carried out separately for cell-free and extracellular vesicle miRNAs.
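The filtering rule and the Blom transformation can be sketched as follows. This is an assumption-laden illustration, not the study code: TMM normalization (done in the study with edgeR) is omitted, the quartile definition (Python's "inclusive" method) is one of several possible conventions, ties are given average ranks, and the miRNA names are hypothetical placeholders. The Blom transformation is taken here in its usual form, mapping rank r among n values to the standard normal quantile of (r - 3/8) / (n + 1/4).

```python
from statistics import NormalDist, quantiles

def third_quartile(values):
    # "inclusive" quartile definition: an assumption, other conventions exist.
    return quantiles(values, n=4, method="inclusive")[2]

def filter_mirnas(counts, min_q3=5):
    """Keep miRNAs whose third quartile of raw counts is at least min_q3.

    counts: dict mapping miRNA name -> list of raw counts across samples.
    """
    return {m: v for m, v in counts.items() if third_quartile(v) >= min_q3}

def blom_transform(values):
    """Rank-based inverse normal transformation with Blom's constants."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - 0.375) / (n + 0.25)) for r in ranks]

# Toy counts (hypothetical names and values, not study data).
toy_counts = {"miR-a": [0, 1, 2, 3, 4], "miR-b": [10, 20, 0, 30, 40]}
kept = filter_mirnas(toy_counts)
z_scores = blom_transform(kept["miR-b"])
```

Here `miR-a` is dropped because its third quartile is 3, and the transformed values of `miR-b` are symmetric around zero, with the middle-ranked sample mapped to 0.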
The primary study objectives were to investigate: i) the prognostic potential of the miRNAs alone or in combination with standard prognostic factors such as age at surgery, sex, and pathological stage; ii) their predictive capacity, and iii) their added value in terms of predictive accuracy as compared to standard prognostic factors alone.
To reach these aims, three distinct models were specified: one using the miRNA data only (Model 1), one using both miRNAs and standard prognostic factors (Model 2), and one using the prognostic factors alone (Model 3). For the first two models, given the large number of biomarkers, elastic net regularized Cox regression was used (fixing the mixing parameter to 0.9 and using 5-fold cross-validation (CV) to identify the optimal regularization tuning parameter, lambda). Results were reported in terms of beta regression coefficients, as the computation of coefficients' standard errors and of other quantities (e.g., confidence intervals) in the context of Cox elastic net models is still a problematic issue. For the third model, the standard Cox model was applied. Given the relatively limited number of observations and events, the predictive accuracy of the survival models was evaluated through a 5-fold CV and based on two metrics: the cross-validated Kaplan-Meier curves and the cross-validated time-dependent Receiver Operating Characteristic (ROC) curves. The CV Kaplan-Meier curves should reflect the ability of the survival model to classify patients characterized by a different risk of relapse or death. The number of groups and the rule for assigning patients to each group should be defined in advance. In our study, we decided to consider only two groups and to assign patients to one group or the other based on the median value of the prognostic index (given by the linear predictor) computed using the regression coefficients obtained from a model developed on the training portion of the CV process [12]. The same prognostic index was used to calculate the CV ROC curves and the corresponding area under the curve (AUC), defining a landmark timepoint of 24 months. Although the log-rank test is usually used to compare KM curves, in the case of CV curves it is necessary to resort to permutations to obtain the distribution of the test statistic under the null hypothesis and, therefore, to evaluate the statistical significance of the log-rank statistic.
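The cross-validated group assignment can be sketched as below. This is a simplified reading of the procedure, under stated assumptions: `fit` is a hypothetical placeholder for whatever routine returns regression coefficients from the training folds (in the study, an elastic net Cox model), the cut-off is taken as the median prognostic index of the training portion, and the fold assignment and seed are arbitrary. The key point illustrated is that each patient is classified using coefficients estimated without that patient's data.

```python
import random
from statistics import median

def linear_predictor(row, beta):
    """Prognostic index: sum of covariate values times coefficients."""
    return sum(x * b for x, b in zip(row, beta))

def cv_risk_groups(X, fit, n_folds=5, seed=0):
    """Assign each subject to a 'high' or 'low' risk group by cross-validation.

    X: list of covariate rows; fit: callable taking the training indices and
    returning a coefficient list (placeholder for the actual model fit).
    """
    n = len(X)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    groups = [None] * n
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        beta = fit(train)  # coefficients from the training portion only
        cutoff = median(linear_predictor(X[i], beta) for i in train)
        for i in test:
            pi = linear_predictor(X[i], beta)
            groups[i] = "high" if pi > cutoff else "low"
    return groups

# Toy example: one covariate and a stand-in "fit" that always returns 1.0.
toy_X = [[float(i)] for i in range(10)]
toy_groups = cv_risk_groups(toy_X, fit=lambda train: [1.0])
```

The resulting labels can then be used to draw cross-validated Kaplan-Meier curves for the two groups; because the labels come from several different fitted models, the usual log-rank reference distribution no longer applies, which is why a permutation approach is needed.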
Similarly, a permutation-based test must be performed to test the null hypothesis of an AUC of 0.5. To this end, we randomly permuted the correspondence between the disease-free survival times and censoring indicators on one side and the clinical covariates and miRNAs on the other, and repeated the CV procedure for each permutation. The number of permutations was set to 500. To evaluate the added value, in terms of predictive accuracy, of the considered biomarkers with respect to basic clinical factors, we tested the difference in the log-rank and AUC values between the combined model (miRNA + known prognostic factors) and the model with known prognostic factors only. In this case too, the level of significance was determined by permutation, although only the miRNA vector was permuted. The above-mentioned analysis plan was first carried out taking the two sets of miRNAs separately and then together.
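A generic permutation test of this kind can be sketched as follows. The sketch is hedged in several ways: `statistic` is a hypothetical callable standing in for the whole cross-validation pipeline (for example, one returning a CV log-rank statistic or a CV AUC), the "+1" correction in the p-value is a common convention rather than something stated in the text, and the toy statistic is chosen only so that the result is easy to verify.

```python
import random

def permutation_p_value(observed, statistic, outcomes, predictors,
                        n_perm=500, seed=0):
    """One-sided permutation p-value for an observed statistic.

    outcomes: per-subject outcomes (in a survival setting, e.g. pairs of
    time and censoring indicator); predictors: per-subject covariates.
    The link between the two is broken by shuffling the outcomes.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_perm):
        shuffled = list(outcomes)
        rng.shuffle(shuffled)
        if statistic(shuffled, predictors) >= observed:
            exceed += 1
    # Add-one correction (an assumption) so the p-value is never exactly 0.
    return (exceed + 1) / (n_perm + 1)

# Toy statistic: sum of products, maximal when outcomes and predictors
# are sorted in the same order (hypothetical example, not a survival metric).
def toy_statistic(outcomes, predictors):
    return sum(o * p for o, p in zip(outcomes, predictors))

toy_outcomes = list(range(20))
toy_predictors = list(range(20))
toy_observed = toy_statistic(toy_outcomes, toy_predictors)
toy_p = permutation_p_value(toy_observed, toy_statistic,
                            toy_outcomes, toy_predictors)
```

For the added-value comparison described above, the same scheme would apply with only the miRNA columns shuffled and the statistic defined as the difference between the combined and the clinical-only model.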
The association between demographic and clinical covariates and miRNAs was tested by means of Student's t-test, and p-values were adjusted using the Benjamini-Hochberg method.
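For reference, the Benjamini-Hochberg adjustment can be written in a few lines. The sketch below follows the standard step-up definition (each sorted p-value is multiplied by n divided by its rank, and monotonicity is then enforced from the largest p-value downwards); the function name and toy p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    # Visit p-values from largest to smallest.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, i in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Toy p-values (hypothetical, not study results).
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

For the toy input, the adjusted values are approximately 0.02, 0.04, 0.04 and 0.02, in the original order.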
The statistical analyses were performed using STATA 15.0 (College Station, Texas, USA) and R version 4.2.0 (R Core Team, Vienna, Austria), mainly using the following packages: edgeR, survival, glmnet. The R package gplots was used for the graphical representation of selected miRNAs through a heatmap (hierarchical clustering with the Euclidean distance measure and the complete agglomeration method was used); the plot was annotated and ordered using the DFS status at 24 months.

Enrichment analysis
Pathway enrichment analysis of genes targeted by relevant microRNAs was performed using miRNet (https://www.mirnet.ca) [13]. Briefly, miRTarBase v8.0 and TarBase v8.0 were adopted by the software to select experimentally validated microRNA-gene interactions. Then, a network was created, and the minimum network filtering approach was used to reduce network size and keep the main connection patterns. Reactome was then interrogated for pathway enrichment of the network.
Pathway enrichment was performed starting from a single list of CF- and EV-miRNAs associated with DFS in the main statistical analyses (Model 1) and considering separate lists for each type of miRNA. Additional analyses were carried out according to the sign of the miRNA's relationship with DFS, that is, the sign of the models' regression coefficients. Moreover, all analyses were repeated adding miRNAs correlated (with a Pearson's correlation coefficient of at least 0.6 in absolute value) with those selected by the main analyses.
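The expansion of a miRNA list by correlation can be illustrated with a short sketch. The function names, the miRNA labels, and the expression values are hypothetical; the only element taken from the text is the threshold of 0.6 on the absolute value of Pearson's correlation coefficient.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlated_mirnas(expr, selected, threshold=0.6):
    """Names of non-selected miRNAs with |r| >= threshold to any selected one.

    expr: dict mapping miRNA name -> expression values across samples.
    """
    extra = set()
    for s in selected:
        for name, values in expr.items():
            if name not in selected and abs(pearson(expr[s], values)) >= threshold:
                extra.add(name)
    return sorted(extra)

# Toy expression values (hypothetical names and numbers, not study data).
toy_expr = {
    "miR-sel": [1, 2, 3, 4, 5],
    "miR-up": [2, 4, 6, 8, 11],    # strongly positively correlated
    "miR-down": [5, 4, 3, 2, 1],   # perfectly negatively correlated
    "miR-flat": [3, 1, 3, 1, 3],   # uncorrelated with miR-sel
}
toy_extra = correlated_mirnas(toy_expr, selected={"miR-sel"})
```

Because the absolute value is used, both positively and negatively correlated miRNAs are added; restricting to positive correlations would only require dropping the `abs` call.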

Demographic and clinical characteristics
A total of 222 patients were enrolled in the study. One hundred and thirty-three patients were males (59.9%), the median age at surgery was 70 years (IQ-IIIQ: 63-75), and the majority (86.1%) had a history of smoking (Table 1). In addition, most patients (95.8%) had an ECOG performance status (PS) of less than or equal to 1. One hundred and thirty-seven patients (64.3%) had a pathologic stage I tumor, 36 (16.9%) a stage II tumor, and 40 (18.8%) a stage IIIA cancer. Adenocarcinoma was the most common histologic type (81.0%). Twenty-one (9.5%) patients received pre-surgery neoadjuvant therapy, whereas 31 (14.7%) and 14 (6.6%) patients received adjuvant chemo- and/or radiotherapy, respectively.

MiRNAs and demographic and clinical covariates
Supplementary Fig. 3 shows the patients' disposition for the main statistical analyses of the study. The prognostic role of the biomarkers was first investigated for the CF-miRNAs and EV-miRNAs separately and then together. The unsupervised filtering of patients and miRNAs was done once, separately for the two types of miRNAs. Thus, the patients' biological samples that passed the quality check were 176 and 171, whereas the retained miRNAs were 438 and 232, for CF- and EV-miRNAs respectively.

Prognostic potential of CF-and EV-miRNAs
Information on DFS was available for 195 (88%) patients. The median follow-up time was 26.3 months (95% CI: 25.4-27.6), and the median DFS was not reached (NR). A total of 52 events were observed. Table 2 shows the results obtained by fitting three distinct models, as described in the Statistical Analysis section. Models 1 and 2 were obtained using elastic-net penalized Cox regression, whereas Model 3 was obtained by means of standard Cox regression.
Model 1 aimed at evaluating the prognostic potential of the miRNAs alone. From this analysis, three CF-miRNAs and 21 EV-miRNAs were associated with DFS (that is, had beta coefficients different from zero). Most of them had a negative regression coefficient, indicating that as the expression of the miRNA increases, the hazard of relapse or death decreases. The performance of this model is reported in Fig. 1, panels A and D, for the analyses on cell-free miRNAs, and in Fig. 2, panels A and D, for those on EV-derived miRNAs. The model derived from the CF-miRNA data showed, as compared to the model derived from the EV-miRNAs, better performance in terms of both CV Kaplan-Meier curve separation and discriminatory accuracy (AUCs were 0.67 and 0.59, respectively).
The second and third models were fitted specifically to investigate whether the miRNA data added predictive accuracy to a model including readily available information such as age, sex, and stage of disease. The second model was derived from the miRNA, demographic, and clinical data. In this model, two CF-miRNAs (miR-29c-3p and miR-877-3p) and five EV-miRNAs (miR-181a-2-3p, miR-182-5p, miR-192-5p, miR-532-3p and miR-589-5p) remained associated with DFS. In particular, higher expression of the cell-free miR-29c-3p and of the two extracellular vesicle miRNAs miR-182-5p and miR-192-5p was associated with a worse prognosis. Conversely, higher expression of the cell-free miR-877-3p and of the extracellular vesicle miRNAs miR-181a-2-3p, miR-532-3p, and miR-589-5p was associated with a better prognosis. The performance of these combined models was almost comparable in terms of curve separation but slightly better for the model including EV-derived miRNAs in terms of AUC, which was equal to 0.68 as compared to a value of 0.62 obtained from the combined model including cell-free miRNAs (Figs. 1 and 2, panels B and E).
The third model included age at surgery, sex, and the pathologic stage of disease alone and was fitted to investigate whether the molecular information can significantly improve the predictive accuracy of known prognostic factors routinely available in clinical practice. Age at surgery was included as a continuous covariate with a 1-unit increment, and the stage was grouped into three main categories. Table 2 reports two hazard ratios for each of these factors, as they were estimated in the two separate datasets, one with CF-miRNA data and one with EV-miRNA data, which included a slightly different number of patients due to the filtering procedures described in the Statistical Analysis section and cited above. Supplementary Table 1 shows the results of univariate Cox models fitted on the 195 patients with available DFS data, whereas Supplementary Fig. 5 shows the DFS curves for each stage category. Given the small number of patients within each subtype of stages I, II, and III, in Model 3 reported in Table 2 only the three main categories for the stage were considered.
Thus, the model including only the clinical factors, Model 3, showed per se good predictive performance, as shown in Figs. 1 and 2, panels C and F.
The combined models (one for CF- and one for EV-miRNAs), containing both miRNAs and clinical factors (age, sex, stage), did not show significantly higher performance compared to the model including the non-biological factors alone (Models 2 vs. Models 3). From the comparison of the CV Kaplan-Meier curves derived from these models (Fig. 1, panels B and C for CF and Fig. 2, panels B and C for EV) and of the AUCs of the CV ROC curves, it emerges that miRNAs do not provide additional survival risk discrimination to that already provided by basic covariates (the permuted log-rank p-value for the comparison of the two models was equal to 0.640 and 0.248 for CF- and EV-miRNA, respectively, whereas the p-value comparing the AUCs was equal to 0.076 and 0.652 for CF- and EV-miRNA, respectively).
Repeating the analysis considering the two types of miRNAs together (CF- and EV-derived), we obtained the results shown in Supplementary Table 2. These models were fitted on 156 patients (41 events), as reported in Figure S6. Model 1 included 14 miRNAs, most of which (9) had already been found associated with DFS in the separate analyses described above. Based on this model, modestly separated survival risk groups were obtained (permuted log-rank p-value = 0.061), with an AUC of 0.62 (permuted p-value = 0.021) (Supplementary Fig. 6, panel A). From analogous subgroup analyses on stage I patients, 10 CF-miRNAs were associated with DFS. For most of them (miR-135a-5p, miR-877-5p, miR-107, miR-1226-3p, miR-362-5p, miR-3913-5p, miR-548ax), increasing expression values were associated with a reduced hazard of relapse or death, whereas for miR-345-5p and miR-9-3p the opposite was observed. When the standard prognostic factors were also considered, four CF-miRNAs were retained in the model (miR-135a-5p, miR-107, miR-548ax, and miR-9-3p); the direction of the association was the same as before. The models' performance was not satisfactory, and there was no evidence of an incremental predictive capacity compared to age and sex alone (results not shown). This is perhaps due to the small size of this subgroup and its immature follow-up. No EV-miRNA was found to predict DFS in this population.

Pathway enrichment
Pathway enrichment was first conducted by analyzing the 3 CF-miRNAs and the 21 EV-miRNAs associated with DFS in the main statistical analyses (Model 1). Interestingly, TGF-beta and NOTCH were the most enriched among the significant pathways found (FDR < 0.05). In addition, at a lower rich factor, involvement in the cell cycle and immune regulation could be identified (Fig. 3A). To further dissect the role of the miRNAs specifically involved in these pathways, we extended the enrichment analysis using an expanded list of miRNAs, also considering those moderately to highly correlated with the ones listed above (the list of correlated miRNAs is reported in Supplementary Table 3). Pathway enrichment with or without the correlated miRNAs showed only minor differences (data not shown). However, including them significantly improved the enrichment in the subsequent analysis, in which the original list of miRNAs was further divided into additional subtypes.
Then, we evaluated separately the pathways enriched by miRNAs derived from CF and those derived from EV. In particular, considering both the miRNAs deriving from the main analysis and their correlated miRNAs, we found that miRNAs involved in regulating genes participating in the TGF-beta and NOTCH pathways were enriched in the EV fraction. In addition, miRNAs targeting components of tyrosine kinase and PI3K signalling pathways were found enriched in EVs. Conversely, the miRNAs identified in the CF fraction were predominantly involved in broader pathways, such as the cell cycle and immune system (Supplementary Tables 4-5).
Thus, an additional analysis focusing on the EV fraction was performed, considering separately the EV-miRNAs with positive regression coefficients (higher expression values associated with a shorter DFS or higher instantaneous hazard of relapse or death) and those with negative regression coefficients (higher expression values associated with a longer DFS or lower instantaneous hazard of relapse or death). For this step, positively correlated miRNAs were added to each miRNA list before enrichment analysis. miRNAs with positive regression coefficients were shown to target genes involved in vesicle transport, mitotic activities, and the immune system. On the other hand, miRNAs with negative regression coefficients were found to have a stronger association with the TGF-beta/SMAD2/3 and NOTCH pathways (Fig. 3C, Supplementary Tables 6-7). In particular, 8 miRNAs were involved in the enrichment of these pathways (miR-18a-5p, miR-370-3p, miR-628-5p, miR-125a-5p, miR-376c-3p, miR-381-3p, miR-323a-3p, miR-409-3p) through the targeting of TGFBR2 and SMAD2. Accordingly, the expression level of these 8 miRNAs was globally lower in patients with worse outcomes (Fig. 3C); in particular, 5 of these (miR-125a-5p, miR-370-3p, miR-628-5p, miR-381-3p, miR-323a-3p) were associated with DFS (Fig. 3D).

Discussion
Circulating miRNAs have attracted much interest in cancer diagnosis and prognosis as they are small and very stable in circulation and might have essential functions. For these reasons, they are good candidates as molecular biomarkers capable of selecting patients at higher risk of recurrence. However, very conflicting results regarding their role have been reported, leaving doubts about their real value as prognostic markers. The heterogeneous results are mainly due to different aspects: the panel of miRNAs analyzed (small panel of miRNAs or miRNome analysis), the methodology used for miRNA evaluation and normalization, the statistical approach applied, the study endpoint, the specimen used for the analysis (plasma or serum) and the methods of collection. Moreover, "circulating miRNAs" can differ depending on whether we consider EV-derived miRNAs or the overall circulating miRNA content, as it is well established that EV-derived miRNAs could exert more specific functions in the context of lung cancer than those released in the circulating compartment [14,15]. Furthermore, although numerous studies have analyzed miRNAs as potential prognostic factors, the added value of miRNAs with respect to the established prognostic factors used in the clinic, mainly the stage of the disease, remains unclear.
To address some of these aspects, we performed a prospective multicentre study to clarify the role of circulating miRNAs as prognostic biomarkers in ES-NSCLC, focusing on the added value of the miRNAs with respect to the known clinicopathological prognostic parameters.
From the model built on the miRNA data, we found that 3 CF-miRNAs and 21 EV-miRNAs were associated with DFS without considering the integration of known clinical prognostic factors. Most of them exhibited a negative regression coefficient, meaning that their higher expression values are associated with a better prognosis, pointing to a tumor-suppressor function of the specific miRNAs.
Concerning the CF-miRNAs, miR-135a-5p and miR-877-3p were overexpressed in patients with a lower risk of relapse or death, which aligns with their reported biological function. It was shown that miR-877 overexpression repressed NSCLC cell growth by targeting tartrate-resistant acid phosphatase (TRAP), also known as acid phosphatase 5 (ACP5), and inhibiting the PI3K/AKT pathway [16]. Concerning miR-135a-5p, several studies have shown that its expression is decreased in different solid tumors such as glioma [17], gallbladder [18], colorectal cancer [19], and also lung cancer [20]. Conversely, we observed that CF miR-29c-3p upregulation was associated with worse DFS. This observation is consistent with its potential role as a tumor promoter, recently shown in ovarian cancer models [21].
Cell-free miRNAs can be released and taken up by cells through vesicle trafficking and protein carrier mechanisms, and they are able to function as gene expression regulators in cell-to-cell communication mechanisms under normal and pathological conditions, such as cancer [35]. However, we have to consider that miRNAs are released into the bloodstream from different cell types, not only from tumor cells [36], and this still unclear aspect leaves some open questions regarding the real role that these miRNAs found in the blood circulation may have. Moreover, miRNAs can target several different genes and, as a consequence, can affect different pathways. Taking this into consideration, a pathway enrichment analysis was performed to obtain a global overview of the different pathways in which the significant miRNAs are involved.
TGF-beta/SMAD and NOTCH are well-known prognostic pathways in NSCLC [42][43][44][45]. In particular, a specific association between TGF-beta expression and risk of relapse in ES-NSCLC has been shown [46]. With regard to NOTCH, a previous study showed that specific polymorphisms of the gene are associated with survival rates in ES-NSCLC [47]. Moreover, an in vitro study demonstrated that NOTCH signaling significantly affects the growth and the malignant phenotype of both colorectal and lung models [48]. Interestingly, an enrichment of the PI3K pathway was also observed in EV-derived miRNAs. The involvement of the PI3K pathway in the prognostic risk determination of ES-NSCLC was already shown in previous studies [49,50], in accordance with our results. Overall, we found an evident enrichment of TGF-beta/SMAD, NOTCH, and PI3K pathways in EV-derived miRNAs.
This reinforces the already demonstrated role of EVs as components with specific functional activities in the regulation of cell growth, which consequently can have important prognostic and predictive roles in response to therapies [51]. It has been shown that both tumor and immune cells can release specific EVs containing components, mainly miRNAs, with specific functions able to regulate cancer-specific processes such as epithelial-mesenchymal transition [52,53], neovascularization [54,55], and anti-tumor immune cell function [56,57]. Hence, EV-derived miRNAs seem to have more specific functions than CF-miRNAs, which could derive from EVs themselves but also from other, less specific sources, such as necrotic or apoptotic cells.
To evaluate the added predictive value of the miRNAs over basic prognostic factors, statistical models combining the miRNAs with disease stage, age, and sex were specified and then compared with a model including only the demographic and clinical factors. In the combined models (one for CF- and one for EV-miRNAs), two CF-miRNAs (miR-29c-3p and miR-877-3p) and five EV-miRNAs (miR-181a-2-3p, miR-182-5p, miR-192-5p, miR-532-3p and miR-589-5p) remained associated with DFS, but we did not observe significantly higher predictive performance compared to the models including the clinical factors alone. However, when we derived the combined model using the data from both types of miRNAs, a significant increase in prognostic accuracy was found. In particular, three CF-miRNAs (miR-135a-5p, miR-29c-3p, miR-877-3p) and one EV-miRNA (miR-192-5p) made a substantial contribution in addition to the clinical factors.
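The cross-validated comparison behind the Kaplan-Meier curves in Figs. 1 and 2 can be sketched as follows: within each fold, a model is fitted on the training patients only, and each held-out patient is labelled high or low risk by a model that never saw that patient. This is a minimal sketch under stated assumptions, not the study's code. `toy_fit` is a hypothetical stand-in for the elastic net penalized Cox fit, the training-set median cutoff is an assumed choice, and the record fields `marker`, `time`, and `event` are illustrative.

```python
import random
from statistics import mean, median


def toy_fit(train):
    """Hypothetical placeholder for a penalized Cox fit.

    Learns only the sign of a single marker from the training data and
    returns a function mapping a patient record to a prognostic index.
    """
    with_event = [p["marker"] for p in train if p["event"] == 1]
    without_event = [p["marker"] for p in train if p["event"] == 0]
    sign = 1.0 if mean(with_event) >= mean(without_event) else -1.0
    return lambda patient: sign * patient["marker"]


def cross_validated_risk_groups(patients, fit_model, k=5, seed=0):
    """Label every patient 'high' or 'low' risk using k-fold cross-validation."""
    order = list(range(len(patients)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]  # k disjoint held-out sets
    groups = [None] * len(patients)
    for held_out in folds:
        held = set(held_out)
        train = [patients[i] for i in order if i not in held]
        score = fit_model(train)  # refit from scratch in every fold
        # Assumed rule: cutoff is the median training-set prognostic index.
        cutoff = median(score(p) for p in train)
        for i in held_out:
            groups[i] = "high" if score(patients[i]) > cutoff else "low"
    return groups
```

Running the same function once with a miRNA-only fit, once with miRNAs plus clinical factors, and once with clinical factors alone yields three sets of cross-validated risk labels whose survival curves can then be compared, which mirrors the three-model design described above.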
Our study has several strengths and limitations. Among the primary strengths are i) the prospective collection of biological specimens and clinical data and ii) the analysis of the miRNome and the differentiation of miRNA origins. Moreover, despite the relatively small sample size and the lack of an external validation cohort, we were still able to preliminarily evaluate the performance of the fitted models and, through the approach proposed by Simon et al. [12], to investigate the added predictive value of the miRNAs compared with basic non-biological prognostic factors, in particular the stage of disease, something that is not often reported in published studies.
As mentioned above, one limitation of this study is the modest sample size and the limited follow-up time (the median was about 26 months), especially given the high proportion of stage I cancers in our cohort (64.3%). Both factors limited the possibility of stratified analyses and their statistical power, leaving several questions unanswered, for example, whether different miRNAs are involved in prognosis prediction depending on disease characteristics (e.g., stage and histotype) or whether their prognostic effect varies within strata. However, enrollment is ongoing in another funded study on ES-NSCLC carried out by some of the RESTING centers, which will allow us to validate our data. Another limitation relates to the uncertain source of the miRNAs found in circulation, which leaves open questions regarding their functional role in cancer development. Moreover, we are aware that the different methodologies available for EV isolation may retain a fraction of small lipoproteins that could interfere with the subsequent analyses. However, we used an EV purification method that relies on the specific binding of EV surface proteins to negatively charged silicon membranes, exploiting differences in pH and isoelectric points compared to other proteins, such as histones and lipoproteins, in biological fluids. This approach allows for the selective adsorption and subsequent elution of EVs while effectively excluding most lipoprotein contaminants, as demonstrated in the existing literature [58]. Given that there is no gold standard procedure for EV isolation, open questions remain about the optimal methodology. Nevertheless, our approach seems to be reproducible and robust, with potential for future inclusion in clinical practice.
In summary, our study highlights potential miRNAs able to predict the risk of relapse after surgery in patients with ES-NSCLC. When considered separately, CF- and EV-miRNAs did not significantly improve the prognostic performance of the model based on known clinical prognostic factors. However, when they were considered together, a significant improvement was observed. Thus, the identified circulating miRNAs could represent a non-invasive approach that gives clinicians further information for deciding on the most appropriate patient management.

Fig. 1 Performance of the models using cell-free miRNA data. From (A) to (C), five-fold cross-validated Kaplan-Meier curves for the models derived using the miRNA data alone, the miRNA and basic prognostic factors (age, sex, and pathologic stage) data, and the prognostic factors data alone. From (D) to (F), five-fold cross-validated time-dependent ROC curves (at 24 months) for the three models. DFS disease-free survival, TP true positive, FP false positive

Fig. 2 Fig. 3
Fig. 2 Performance of the models using extracellular vesicle miRNA data. From (A) to (C), five-fold cross-validated Kaplan-Meier curves for the models derived using the miRNA data alone, the miRNA and basic prognostic factors (age, sex, and pathologic stage) data, and the prognostic factors data alone. From (D) to (F), five-fold cross-validated time-dependent ROC curves (at 24 months) for the three models. DFS disease-free survival, TP true positive, FP false positive

Table 1 Baseline patient characteristics. Percentages may not equal 100 due to rounding. SD standard deviation, IQ first quartile, IIIQ third quartile, ECOG Eastern Cooperative Oncology Group, PS performance status

Table 2 Multivariable Cox proportional hazards regression models on disease-free survival (separate analyses for CF- and EV-miRNA). Model 1 refers to the results obtained fitting an elastic net penalized Cox model on the microRNA data only; Model 2 refers to the results obtained fitting an elastic net penalized Cox model on the microRNA and basic prognostic factors data; Model 3 refers to the results obtained fitting a standard Cox model on the basic prognostic factors data only. CF cell-free, EV extracellular vesicle, HR hazard ratio, CI confidence interval, pSTAGE pathologic disease stage, Coef beta regression coefficient